AITopics | community label

Exact recovery and Bregman hard clustering of node-attributed Stochastic Block Model

Neural Information Processing Systems | Feb-14-2026, 21:53:42 GMT

However, in many scenarios, nodes also have attributes that are correlated with the clustering structure. Thus, network information (edges) and node information (attributes) can be jointly leveraged to design high-performance clustering algorithms. Under a general model for the network and node attributes, this work establishes an information-theoretic criterion for the exact recovery of community labels and characterizes a phase transition determined by the Chernoff-Hellinger divergence of the model.

artificial intelligence, data mining, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)

Correlated Stochastic Block Models: Exact Graph Matching with Applications to Recovering Communities

Neural Information Processing Systems | Feb-10-2026, 22:08:17 GMT

We derive the precise information-theoretic threshold for exact recovery: above the threshold there exists an estimator that outputs the true correspondence with probability close to 1, while below it no estimator can recover the true correspondence with probability bounded away from 0.

artificial intelligence, graph, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.04)

Genre: Research Report (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.93)

Exact recovery and Bregman hard clustering of node-attributed Stochastic Block Model

Neural Information Processing Systems | Dec-26-2025, 03:46:36 GMT

Classic network clustering tackles the problem of identifying sets of nodes (communities) that have similar connection patterns. However, in many scenarios nodes also have attributes that are correlated with the clustering structure and can also be used to identify node clusters. Thus, network information (edges) and node information (attributes) can be jointly leveraged to design high-performance clustering algorithms. Under a general model for the network and node attributes, this work establishes an information-theoretic criterion for the exact recovery of community labels and characterizes a phase transition determined by the Chernoff-Hellinger divergence of the model. The criterion shows how network and attribute information can be exchanged in order to have exact recovery (e.g., more reliable network information requires less reliable attribute information). This work also presents an iterative clustering algorithm that maximizes the joint likelihood, assuming that the probability distributions of network interactions and node attributes belong to exponential families. This covers a broad range of possible interactions (e.g., edges with weights) and attributes (e.g., non-Gaussian models) while also exploring the connection between exponential families and Bregman divergences. Extensive numerical experiments using synthetic and real data indicate that the proposed algorithm outperforms algorithms that leverage only network or only attribute information as well as recently proposed algorithms that perform clustering using both sources of information. The contributions of this work provide insights into the fundamental limits and practical techniques for inferring community labels on node-attributed networks.
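
To make the link between exponential families and Bregman divergences concrete, the sketch below shows the classical attribute-only Bregman hard clustering loop (assign each point to the center with the smallest divergence, then reset each center to the mean of its members). It is not the joint network-and-attribute algorithm from the paper, whose details are not given in this listing. The function names, the toy data and the two example divergences are assumptions chosen for illustration.

```python
import math

def squared_euclidean(x, y):
    # Bregman divergence associated with Gaussian (fixed-variance) models.
    return sum((a - b) ** 2 for a, b in zip(x, y))

def generalized_kl(x, y):
    # Generalized I-divergence (associated with Poisson models); inputs must be > 0.
    return sum(a * math.log(a / b) - a + b for a, b in zip(x, y))

def bregman_hard_clustering(points, centers, divergence, max_iter=100):
    """Alternate assignment and mean-update steps until labels stop changing.

    For any Bregman divergence d(x, c), the arithmetic mean of a cluster
    minimizes the total divergence to the center, so the update step is the
    same as in k-means regardless of the divergence chosen.
    """
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest center under the chosen divergence.
        new_labels = [
            min(range(len(centers)), key=lambda k: divergence(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each non-empty cluster's center becomes its mean.
        for k in range(len(centers)):
            members = [p for p, z in zip(points, labels) if z == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Toy positive-valued attributes with two obvious groups (hypothetical data).
points = [[1.0], [1.2], [0.9], [10.0], [11.0], [9.5]]
labels, centers = bregman_hard_clustering(points, [[1.0], [10.0]], generalized_kl)
print(labels, centers)
```

Swapping `generalized_kl` for `squared_euclidean` recovers ordinary k-means, which is one way to see why a single algorithm can cover several attribute models.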

exact recovery and bregman, information, node-attributed stochastic block model, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.62)

Variational Estimators for Node Popularity Models

Karki, Jony, Huang, Dongzhou, Zhao, Yunpeng

arXiv.org Machine Learning | Nov-25-2025

Node popularity is recognized as a key factor in modeling real-world networks, capturing heterogeneity in connectivity across communities. This concept is equally important in bipartite networks, where nodes in different partitions may exhibit varying popularity patterns, motivating models such as the Two-Way Node Popularity Model (TNPM). Existing methods, such as the Two-Stage Divided Cosine (TSDC) algorithm, provide a scalable estimation approach but may have limitations in terms of accuracy or applicability across different types of networks. In this paper, we develop a computationally efficient and theoretically justified variational expectation-maximization (VEM) framework for the TNPM. We establish label consistency for the estimated community assignments produced by the proposed variational estimator in bipartite networks. Through extensive simulation studies, we show that our method achieves superior estimation accuracy across a range of bipartite as well as undirected networks compared to existing algorithms. Finally, we evaluate our method on real-world bipartite and undirected networks, further demonstrating its practical effectiveness and robustness.

algorithm, block model, matrix, (16 more...)

arXiv.org Machine Learning

2511.17783

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Minnesota (0.04)
North America > United States > Colorado (0.04)

Genre: Research Report (0.82)

Industry: Government (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.93)

770b3ecb70147a2d2f18d2964fafcdd5-Paper-Conference.pdf

Neural Information Processing Systems | Oct-8-2025, 22:38:43 GMT

artificial intelligence, data mining, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Data Science > Data Mining (0.69)

Correlated Stochastic Block Models: Exact Graph Matching with Applications to Recovering Communities

Neural Information Processing Systems | Aug-17-2025, 02:47:51 GMT

We consider the task of learning latent community structure from multiple correlated networks.

exact community recovery, graph, recovery, (13 more...)

Neural Information Processing Systems

Country: North America > United States > New Jersey > Mercer County > Princeton (0.04)

Genre: Research Report (0.47)

Industry: Information Technology (0.93)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications (0.93)

Strongly Consistent Community Detection in Popularity Adjusted Block Models

Yuan, Quan, Liu, Binghui, Li, Danning, Xue, Lingzhou

arXiv.org Machine Learning | Jun-10-2025

The Popularity Adjusted Block Model (PABM) provides a flexible framework for community detection in network data by allowing heterogeneous node popularity across communities. However, this flexibility increases model complexity and raises key unresolved challenges, particularly in effectively adapting spectral clustering techniques and efficiently achieving strong consistency in label recovery. To address these challenges, we first propose the Thresholded Cosine Spectral Clustering (TCSC) algorithm and establish its weak consistency under the PABM. We then introduce the one-step Refined TCSC algorithm and prove that it achieves strong consistency under the PABM, correctly recovering all community labels with high probability. We further show that the two-step Refined TCSC accelerates clustering error convergence, especially with small sample sizes. Additionally, we propose a data-driven approach for selecting the number of communities, which outperforms existing methods under the PABM. The effectiveness and robustness of our methods are validated through extensive simulations and real-world applications.

artificial intelligence, data mining, machine learning, (16 more...)

arXiv.org Machine Learning

2506.07224

Country:

North America > United States > New York (0.04)
North America > United States > Pennsylvania (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Hungary > Hajdú-Bihar County > Debrecen (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Exact recovery and Bregman hard clustering of node-attributed Stochastic Block Model

Neural Information Processing Systems | Jan-19-2025, 08:38:47 GMT

Classic network clustering tackles the problem of identifying sets of nodes (communities) that have similar connection patterns. However, in many scenarios nodes also have attributes that are correlated with the clustering structure and can also be used to identify node clusters. Thus, network information (edges) and node information (attributes) can be jointly leveraged to design high-performance clustering algorithms. Under a general model for the network and node attributes, this work establishes an information-theoretic criterion for the exact recovery of community labels and characterizes a phase transition determined by the Chernoff-Hellinger divergence of the model. The criterion shows how network and attribute information can be exchanged in order to have exact recovery (e.g., more reliable network information requires less reliable attribute information).
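
As background for the phase transition mentioned above, the sketch below evaluates the Chernoff-Hellinger divergence in its standard form for the plain stochastic block model without attributes. The node-attributed version used in the paper is not spelled out in this listing, so this is an assumed baseline rather than the paper's criterion. The function name, the grid search over `t` and the example rates `a` and `b` are illustrative choices.

```python
import math

def ch_divergence(mu, nu, grid=1001):
    """Approximate the Chernoff-Hellinger divergence between two vectors.

    D_+(mu, nu) = max over t in [0, 1] of
        sum_x [ t*mu[x] + (1 - t)*nu[x] - mu[x]**t * nu[x]**(1 - t) ]

    Entries are assumed strictly positive; the maximum over t is
    approximated on a uniform grid with `grid` points.
    """
    best = 0.0
    for k in range(grid):
        t = k / (grid - 1)
        value = sum(
            t * m + (1 - t) * n - (m ** t) * (n ** (1 - t))
            for m, n in zip(mu, nu)
        )
        best = max(best, value)
    return best

# Symmetric two-community SBM with equal-size communities (assumed setup):
# within-community edge rate a*log(n)/n, across-community rate b*log(n)/n.
# Each community is summarized by its vector of expected scaled degrees.
a, b = 9.0, 1.0
d = ch_divergence([a / 2, b / 2], [b / 2, a / 2])
closed_form = (math.sqrt(a) - math.sqrt(b)) ** 2 / 2
print(d, closed_form)
```

In this symmetric case the maximum is attained at t = 1/2 and the divergence reduces to (sqrt(a) - sqrt(b))^2 / 2, so the familiar exact-recovery condition (sqrt(a) - sqrt(b))^2 >= 2 corresponds to a divergence of at least 1.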

exact recovery and bregman, information, node-attributed stochastic block model, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.45)

Resampled Mutual Information for Clustering and Community Detection

Lim, Cheaheon

arXiv.org Artificial Intelligence | Nov-20-2024

We introduce resampled mutual information (ResMI), a novel measure of clustering similarity that combines insights from information theoretic and pair counting approaches to clustering and community detection. Similar to chance-corrected measures, ResMI satisfies the constant baseline property, but it has the advantages of not requiring adjustment terms and being fully interpretable in the language of information theory. Experiments on synthetic datasets demonstrate that ResMI is robust to common biases exhibited by existing measures, particularly in settings with high cluster counts and asymmetric cluster distributions. Additionally, we show that ResMI identifies meaningful community structures in two real contact tracing networks.
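
The abstract does not define ResMI, so it is not reproduced here. For orientation only, the sketch below computes plain (unadjusted) mutual information between two labelings, the information-theoretic quantity that chance-corrected measures start from. The function name and the toy labelings are assumptions for illustration.

```python
import math
from collections import Counter

def mutual_information(labels_a, labels_b):
    """Mutual information (in nats) between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    joint = Counter(zip(labels_a, labels_b))  # contingency-table counts
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a, b) * log( p(a, b) / (p(a) * p(b)) ), written with raw counts.
        mi += (c / n) * math.log(c * n / (count_a[a] * count_b[b]))
    return mi

same = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])         # identical partitions
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])  # independent partitions
print(same, independent)
```

Plain mutual information does not have a constant baseline: its expected value between unrelated labelings depends on the number and sizes of the clusters, which is the issue that adjusted measures address.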

data mining, information, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2412.03584

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.69)

Maximum Likelihood Estimation on Stochastic Blockmodels for Directed Graph Clustering

Cucuringu, Mihai, Dong, Xiaowen, Zhang, Ning

arXiv.org Machine Learning | Mar-28-2024

This paper studies the directed graph clustering problem through the lens of statistics, where we formulate clustering as estimating underlying communities in the directed stochastic block model (DSBM). We conduct the maximum likelihood estimation (MLE) on the DSBM and thereby ascertain the most probable community assignment given the observed graph structure. In addition to the statistical point of view, we further establish the equivalence between this MLE formulation and a novel flow optimization heuristic, which jointly considers two important directed graph statistics: edge density and edge orientation. Building on this new formulation of directed clustering, we introduce two efficient and interpretable directed clustering algorithms, a spectral clustering algorithm and a semidefinite programming based clustering algorithm. We provide a theoretical upper bound on the number of misclustered vertices of the spectral clustering algorithm using tools from matrix perturbation theory. We compare, both quantitatively and qualitatively, our proposed algorithms with existing directed clustering methods on both synthetic and real-world data, thus providing further ground to our theoretical contributions. Keywords: graph clustering, directed graphs, maximum likelihood estimation, spectral methods, matrix perturbation analysis, semidefinite programming.

algorithm, graph, matrix, (16 more...)

arXiv.org Machine Learning

2403.19516

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Illinois > Champaign County > Champaign (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
